All Questions
44 questions
0 votes
1 answer
37 views
Sklearn EstimatorCV vs GridSearchCV
sklearn has the following description for EstimatorCV estimators (https://scikit-learn.org/stable/glossary.html#term-cross-validation-estimator): "An estimator that has built-in cross-validation ..."
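A minimal sketch of the distinction, on synthetic data: an EstimatorCV estimator such as LogisticRegressionCV bakes the parameter search into a single fit, while GridSearchCV is a generic wrapper around any estimator.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression, LogisticRegressionCV
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=200, random_state=0)

# EstimatorCV: cross-validation is built into the estimator and can reuse
# work across the regularization path, which is often faster.
est_cv = LogisticRegressionCV(Cs=10, cv=5, max_iter=1000).fit(X, y)

# GridSearchCV: refits the estimator for every candidate/fold combination,
# but works with any estimator and any parameter grid.
grid = GridSearchCV(LogisticRegression(max_iter=1000),
                    param_grid={"C": [0.01, 0.1, 1, 10]}, cv=5).fit(X, y)

print(est_cv.C_, grid.best_params_)
```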
0 votes
0 answers
23 views
How to use cross validation to select/evaluate model with probability score as the output?
Initially I was evaluating my models using cross_val with out-of-the-box metrics such as precision, recall, f1 score, etc., or with my own metrics defined in ...
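A minimal sketch of evaluating on probability outputs, with illustrative data: scorers such as "roc_auc" and "neg_log_loss" consume predict_proba during cross-validation, and cross_val_predict can return out-of-fold probabilities for custom metrics.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_predict, cross_val_score

X, y = make_classification(n_samples=300, random_state=0)
clf = RandomForestClassifier(random_state=0)

# Score directly on predicted probabilities rather than hard labels.
print(cross_val_score(clf, X, y, cv=5, scoring="roc_auc"))

# Or collect out-of-fold probabilities and feed them to any custom metric.
proba = cross_val_predict(clf, X, y, cv=5, method="predict_proba")
print(proba[:3])  # one column per class
```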
4 votes
0 answers
92 views
Does a different ROC AUC between cross-validation and the test set indicate overfitting or another problem?
I am training a composite model (XGBoost, Linear Regression, and RandomForest) to predict the probability of a person being injured. These are the results of cross-validation with 5 folds, and I can't see any problem ...
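One hedged way to frame the check, on synthetic data: compare the spread of the fold-level ROC AUC scores against the score on a held-out test set; a large gap can point to overfitting or leakage, while a small one usually does not.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_score, train_test_split

X, y = make_classification(n_samples=500, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2,
                                          stratify=y, random_state=0)

clf = RandomForestClassifier(random_state=0)
cv_scores = cross_val_score(clf, X_tr, y_tr, cv=5, scoring="roc_auc")

clf.fit(X_tr, y_tr)
test_auc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])
print(cv_scores.mean(), cv_scores.std(), test_auc)
```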
0 votes
0 answers
362 views
Split a dataframe into train and test sets with x% for cross-validation
I am working on a dataframe and need to split it into a training set and a test set, with 90% for cross-validation training and 10% for a final test set. The problem is that I do not know where to ...
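A minimal sketch of the usual pattern, assuming a hypothetical "target" column on an illustrative DataFrame: hold out 10% first with train_test_split, then run cross-validation only on the remaining 90%.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split

X, y = make_classification(n_samples=1000, random_state=0)
df = pd.DataFrame(X)
df["target"] = y  # hypothetical label column

# 90% for cross-validation, 10% kept aside for the final test.
train_df, test_df = train_test_split(df, test_size=0.1, random_state=0)

X_train = train_df.drop(columns="target")
y_train = train_df["target"]
print(cross_val_score(LogisticRegression(max_iter=1000), X_train, y_train, cv=5))
```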
1 vote
0 answers
112 views
How to implement k-fold CV in hybrid feature selection and evaluate classification model performance?
I have been working on hybrid feature selection combined with the hyperopt package for hyperparameter tuning, and I am thinking about evaluating the performance of several classifiers. I looked ...
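A minimal sketch of one way to combine the pieces, not the asker's pipeline: score each hyperopt candidate with k-fold cross-validation so tuning and evaluation stay separated (the model and search space here are illustrative).

```python
from hyperopt import fmin, hp, tpe
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=300, random_state=0)

def objective(params):
    clf = RandomForestClassifier(n_estimators=int(params["n_estimators"]),
                                 max_depth=int(params["max_depth"]),
                                 random_state=0)
    # hyperopt minimizes, so return the negated mean CV score.
    return -cross_val_score(clf, X, y, cv=5, scoring="f1").mean()

space = {"n_estimators": hp.quniform("n_estimators", 50, 300, 50),
         "max_depth": hp.quniform("max_depth", 2, 10, 1)}
best = fmin(objective, space, algo=tpe.suggest, max_evals=10)
print(best)
```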
3 votes
2 answers
1k views
Understanding sklearn's learning_curve
I have been using sklearn's learning_curve, and there are a few questions I have that are not answered by the documentation (see also here and here), as well as questions that are raised by the ...
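A minimal sketch of what learning_curve actually computes, on synthetic data: the estimator is refit on growing subsets of the training data, and each subset is cross-validated.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve

X, y = make_classification(n_samples=500, random_state=0)
sizes, train_scores, val_scores = learning_curve(
    LogisticRegression(max_iter=1000), X, y,
    train_sizes=np.linspace(0.1, 1.0, 5), cv=5)

print(sizes)                      # absolute training-set sizes that were used
print(train_scores.mean(axis=1))  # mean score on each training subset
print(val_scores.mean(axis=1))    # mean score on the validation folds
```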
0 votes
1 answer
8k views
neg_mean_squared_error in cross_val_score [closed]
The string "mean_squared_error" appears to be deprecated in cross_val_score now, and the error message says to use ...
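A minimal sketch of the replacement, on illustrative data: scikit-learn scorers follow a greater-is-better convention, so MSE is exposed as the negated "neg_mean_squared_error".

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=200, noise=10, random_state=0)
scores = cross_val_score(LinearRegression(), X, y, cv=5,
                         scoring="neg_mean_squared_error")
print(-scores)  # flip the sign to read the values as plain MSE
```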
0 votes
2 answers
553 views
What makes the validation set a good representative of the test set? [closed]
I am developing a classification model using an imbalanced dataset. I am trying to use different sampling techniques to improve the model performance. For my baseline model, I defined an AdaBoost ...
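One hedged answer in code, assuming the concern is class balance on an imbalanced dataset: a stratified split keeps the class ratio similar in every partition, which makes the validation set more representative of the test set.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2,
                                          stratify=y, random_state=0)

# Both partitions keep roughly the same 90/10 class ratio.
print(np.bincount(y_tr) / len(y_tr), np.bincount(y_te) / len(y_te))
```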
11 votes
1 answer
4k views
Why you shouldn't upsample before cross validation
I have an imbalanced dataset and I am trying different methods to address the data imbalance. I found this article that explains the correct way to cross-validate when oversampling data using SMOTE ...
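A minimal sketch of the pattern the linked article argues for, assuming the imbalanced-learn package: putting SMOTE inside an imblearn Pipeline oversamples only the training part of each fold, never the validation fold.

```python
from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)

# The pipeline applies SMOTE during fit only, so each validation fold
# is scored on untouched, original-distribution data.
pipe = Pipeline([("smote", SMOTE(random_state=0)),
                 ("clf", LogisticRegression(max_iter=1000))])
print(cross_val_score(pipe, X, y, cv=5, scoring="f1"))
```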
1 vote
1 answer
6k views
Very low cross-val score for regression despite high .corr() between feature and target
I'm trying to fit a regression with sklearn between one feature and one target. This is the dataset that I have: ...
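A minimal sketch with synthetic data, since the original dataset is elided: a strong Pearson correlation does not guarantee a good cross-validated R², so it is worth checking the fold scores directly.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 100)
y = 3 * x + rng.normal(0, 5, 100)

X = x.reshape(-1, 1)  # sklearn expects a 2-D feature matrix
print(np.corrcoef(x, y)[0, 1])                                  # correlation
print(cross_val_score(LinearRegression(), X, y, cv=5, scoring="r2"))
```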
1 vote
1 answer
188 views
How to transform predicted results when doing cross-validation in sklearn?
I want to do cross-validation in sklearn like below, but the predicted result for X still needs to be transformed to reduce its distance from y. How can I do that by adding a custom function? ...
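One hedged option, assuming the transform applies to the target: TransformedTargetRegressor accepts custom func/inverse_func callables and applies them around fitting, so the transform participates in every cross-validation fold automatically.

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=200, noise=5, random_state=0)
y = y - y.min() + 1  # keep targets positive for the log transform

# y is log-transformed before fitting; predictions are exp-transformed back.
model = TransformedTargetRegressor(regressor=LinearRegression(),
                                   func=np.log, inverse_func=np.exp)
print(cross_val_score(model, X, y, cv=5, scoring="r2"))
```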
1 vote
2 answers
564 views
What is the meaning of zip(kfold.split(X, Y)) in sklearn?
What is the meaning of zip(kfold.split(X, Y)) in sklearn? for (train, test) in zip(kfold.split(X, Y)):
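A minimal sketch of the answer: KFold.split already yields (train_idx, test_idx) pairs, so the zip(...) wrapper is unnecessary; with a single argument, zip would yield 1-tuples and break the unpacking.

```python
import numpy as np
from sklearn.model_selection import KFold

X = np.arange(20).reshape(10, 2)
Y = np.arange(10)

kfold = KFold(n_splits=5)
for train, test in kfold.split(X, Y):  # no zip needed
    print("train:", train, "test:", test)
```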
2 votes
2 answers
455 views
Advice and ideas appreciated - machine learning one-man project
I have a project where I am supposed to start from scratch and learn how machine learning works. So far everything is working out better than expected, but I feel as if I am offered too many ways to choose ...
0 votes
1 answer
2k views
scikit-learn cross-validation and model retraining
I want to train a model and also perform cross-validation in scikit-learn. If I want to access the model (for instance, to see the selected parameters and weights, or to predict), I will need to fit it ...
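A minimal sketch of both options, on synthetic data: cross_validate(..., return_estimator=True) keeps the per-fold fitted models for inspection, and a final fit on all the data gives the model to actually deploy.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate

X, y = make_classification(n_samples=300, random_state=0)
res = cross_validate(LogisticRegression(max_iter=1000), X, y, cv=5,
                     return_estimator=True)
print(res["test_score"])
print(res["estimator"][0].coef_[0][:3])  # inspect one fold's fitted model

# CV only estimates performance; refit on everything for the final model.
final_model = LogisticRegression(max_iter=1000).fit(X, y)
```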
1 vote
1 answer
226 views
Splitting a large multi-class dataset into train and test using a leave-one-out scheme
I am doing some supervised learning using neural networks, and I have a Targets array containing 1906 samples with 664 unique values; the minimum count of each unique value is 2, by design. Is there ...
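A hedged sketch on labels generated to mimic the question (664 classes, each appearing at least twice): with a minimum count of 2 per class, a stratified 50/50 split can place one sample of every rare class on each side.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
y = np.repeat(np.arange(664), 2)             # every class twice
extra = rng.integers(0, 664, 1906 - len(y))  # pad to 1906 samples
y = np.concatenate([y, extra])
X = rng.normal(size=(len(y), 5))

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5,
                                          stratify=y, random_state=0)
print(len(np.unique(y_tr)), len(np.unique(y_te)))  # both cover all 664 classes
```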